A hybrid approach to online speaker diarization
Abstract
This article presents a low-latency speaker diarization system (“who is speaking now?”) based on a hybrid approach that combines a traditional offline speaker diarization system (“who spoke when?”) with an online speaker identification system. The system fulfills all requirements of the diarization task, i.e., it does not need any a priori information about the input, including specific speaker models. After an initialization phase, the approach allows a low-latency decision on the current speaker with an accuracy close to that of the underlying offline diarization system. The article describes the approach, evaluates the robustness of the system, and analyzes the latency/accuracy trade-off.
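The hybrid idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the models or features used, so the sketch assumes that an offline diarization pass over the initialization audio has already produced one cluster label per feature vector, and it substitutes a simple nearest-centroid classifier for the speaker identification step. The function names `train_speaker_models` and `identify_speaker` and the toy two-dimensional features are hypothetical.

```python
import math
from collections import defaultdict


def train_speaker_models(features, labels):
    """Build one centroid per offline cluster label.

    Assumption: `labels` is the output of an offline diarization pass
    over the initialization audio, one label per feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def identify_speaker(models, window):
    """Return the label whose centroid is closest to the window mean.

    `window` holds the most recent feature vectors. Nearest-centroid
    matching is a stand-in for a real speaker identification model.
    """
    dim = len(window[0])
    mean = [sum(vec[d] for vec in window) / len(window) for d in range(dim)]
    return min(models, key=lambda label: math.dist(models[label], mean))


# Toy data standing in for features of the initialization phase.
offline_features = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
offline_labels = ["spk0", "spk0", "spk1", "spk1"]

models = train_speaker_models(offline_features, offline_labels)
current = identify_speaker(models, [[4.8, 5.2], [5.1, 4.9]])
print(current)
```

In a sketch like this, the length of `window` is the knob behind the latency/accuracy trade-off: a longer window delays the decision but averages over more evidence.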
Similar resources
Semi-supervised On-line Speaker Diarization for Meeting Data with Incremental Maximum A-posteriori Adaptation
Almost all current diarization systems are off-line and ill-suited to the growing need for on-line or real-time diarization. Our previous work reported the first on-line diarization system for the most challenging speaker diarization domain involving meeting data captured with a single distant microphone (SDM). Even if results were not dissimilar to those reported for online diarization in less ...
Online two speaker diarization
Short conversations pose some challenges for online diarization due to data sparseness and unbalanced representation of the two speakers. This paper presents our recent advances in online diarization of two-wire telephone conversations, introducing several methods for improving processing efficiency and accuracy on short conversations. Our framework is based on the offline diarization of a conv...
Using a GPU, Online Diarization = Offline Diarization
This article presents a low-latency, online speaker diarization system (“who is speaking now?”) based on the repeated execution of a GPU-optimized, highly efficient offline diarization system (“who spoke when”). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In contrast to earli...
Confidence for Speaker Diarization using PCA Spectral Ratio
Confidence scoring is an important component in speaker diarization systems, both for offline speech analytics and for online diarization systems, which are required to produce the speaker segmentation from very little audio. This paper proposes a confidence measure for speaker diarization based on the spectral ratio of the eigenvalues of the Principal Component Analysis (PCA) transformation computed on ...
Online Diarization of Telephone Conversations
Speaker diarization systems attempt to perform segmentation and labeling of a conversation between R speakers, while no prior information is given regarding the conversation. Diarization systems basically try to answer the question “Who spoke when?”. In order to perform speaker diarization, most state-of-the-art diarization systems operate in an off-line mode, that is, all of the samples of ...